AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Hong, Ziming, Huang, Tianyu, Chen, Runnan, Ye, Shanshan, Gong, Mingming, Han, Bo, Liu, Tongliang

arXiv.org Artificial Intelligence

Recent studies have extended diffusion-based, instruction-driven 2D image editing pipelines to 3D Gaussian Splatting (3DGS), enabling faithful manipulation of 3DGS assets and greatly advancing 3DGS content creation. However, this also exposes these assets to serious risks of unauthorized editing and malicious tampering. Although imperceptible adversarial perturbations against diffusion models have proven effective for protecting 2D images, applying them to 3DGS encounters two major challenges: view-generalizable protection and balancing invisibility with protection capability. In this work, we propose the first editing safeguard for 3DGS, termed AdLift, which prevents instruction-driven editing across arbitrary views and dimensions by lifting strictly bounded 2D adversarial perturbations into a 3D Gaussian-represented safeguard. To ensure that the adversarial perturbations are both effective and invisible, the safeguard Gaussians are progressively optimized across training views using a tailored Lifted PGD, which first performs gradient truncation during back-propagation from the editing model at the rendered image and applies projected gradients to strictly constrain the image-level perturbation. The resulting perturbation is then backpropagated to the safeguard Gaussian parameters via an image-to-Gaussian fitting operation. Alternating between gradient truncation and image-to-Gaussian fitting yields consistent adversarial protection across different viewpoints and generalizes to novel views. Empirically, qualitative and quantitative results demonstrate that AdLift effectively protects against state-of-the-art instruction-driven 2D image and 3DGS editing.
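The projected-gradient step that Lifted PGD applies at the image level can be sketched in NumPy. This is a minimal single-image illustration, not the paper's implementation: the function name, the L∞ bound `eps`, and the step size `alpha` are illustrative assumptions, and `loss_grad` stands in for the gradient backpropagated (with truncation) from the editing model.

```python
import numpy as np

def pgd_step(image, perturbation, loss_grad, eps=8/255, alpha=2/255):
    """One projected-gradient ascent step: move along the sign of the
    loss gradient, then clip the perturbation back into the L-infinity
    ball of radius eps so it stays strictly bounded (imperceptible)."""
    perturbation = perturbation + alpha * np.sign(loss_grad)
    perturbation = np.clip(perturbation, -eps, eps)        # project onto the L-inf ball
    adversarial = np.clip(image + perturbation, 0.0, 1.0)  # keep a valid pixel range
    return adversarial, perturbation

# Toy usage: a 4x4 grayscale "image" and a random stand-in gradient.
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, (4, 4))
adv, delta = pgd_step(img, np.zeros((4, 4)), rng.normal(size=(4, 4)))
```

In AdLift the resulting image-level perturbation is then fitted back onto the safeguard Gaussian parameters, a step this sketch does not model.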


PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Hanson, Alex, Tu, Allen, Singla, Vasu, Jayawardhana, Mayuka, Zwicker, Matthias, Goldstein, Tom

arXiv.org Artificial Intelligence

Recent advances in novel view synthesis have enabled real-time rendering speeds with high reconstruction accuracy. 3D Gaussian Splatting (3D-GS), a foundational point-based parametric 3D scene representation, models scenes as large sets of 3D Gaussians. However, complex scenes can consist of millions of Gaussians, resulting in high storage and memory requirements that limit the viability of 3D-GS on devices with limited resources. Current techniques for compressing these pretrained models by pruning Gaussians rely on combining heuristics to determine which Gaussians to remove. At high compression ratios, these pruned scenes suffer from heavy degradation of visual fidelity and loss of foreground details. In this paper, we propose a principled sensitivity pruning score that preserves visual fidelity and foreground details at significantly higher compression ratios than existing approaches. It is computed as a second-order approximation of the reconstruction error on the training views with respect to the spatial parameters of each Gaussian. Additionally, we propose a multi-round prune-refine pipeline that can be applied to any pretrained 3D-GS model without changing its training pipeline. After pruning 90% of Gaussians, a substantially higher percentage than previous methods, our PUP 3D-GS pipeline increases average rendering speed by 3.56× while retaining more salient foreground information and achieving higher image quality metrics than existing techniques on scenes from the Mip-NeRF 360, Tanks & Temples, and Deep Blending datasets.
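A common way to realize such a second-order score for an L2 reconstruction loss is a Fisher-style approximation: accumulate, over the training views, the outer products of the error gradient with respect to one Gaussian's spatial parameters, then summarize the resulting matrix as a scalar. The sketch below is a hedged illustration under those assumptions; the 6-dimensional parameter vector, the damping term, and the log-determinant summary are placeholders, not the paper's exact definition.

```python
import numpy as np

def sensitivity_score(per_view_grads):
    """Approximate second-order sensitivity of the reconstruction error
    to one Gaussian's spatial parameters: accumulate per-view gradient
    outer products g g^T (a Gauss-Newton/Fisher-style Hessian proxy),
    then return the log-determinant as a scalar pruning score."""
    d = per_view_grads.shape[1]
    hessian_proxy = np.zeros((d, d))
    for g in per_view_grads:
        hessian_proxy += np.outer(g, g)
    # Small damping keeps the determinant finite in rank-deficient cases.
    sign, logdet = np.linalg.slogdet(hessian_proxy + 1e-8 * np.eye(d))
    return float(logdet)

# Toy usage: 10 per-view gradients for each of 3 Gaussians with a
# 6-dimensional spatial parameterization; prune the lowest-scoring ones.
rng = np.random.default_rng(1)
scores = [sensitivity_score(rng.normal(size=(10, 6))) for _ in range(3)]
```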



Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis

Neural Information Processing Systems

Despite the substantial progress of novel view synthesis, existing methods, either based on the Neural Radiance Fields (NeRF) or more recently 3D Gaussian Splatting (3DGS), suffer significant degradation when the input becomes sparse.


From Chaos to Clarity: 3DGS in the Dark

Neural Information Processing Systems

Novel view synthesis from raw images provides superior high dynamic range (HDR) information compared to reconstructions from low dynamic range RGB images. However, the inherent noise in unprocessed raw images compromises the accuracy of 3D scene representation.



Appendix A: Implementation Details

Neural Information Processing Systems

In this section, we describe the implementation details of our algorithm. We utilize the official implementations of TD-MPC [11] and MoDem [8]. To update the SAE, we collect online data using a buffer with a size of 256. The following is the network architecture of the first STN block inserted into the encoder of TD-MPC. The following is the network architecture of the first STN block inserted into the encoder of MoDem.

Hyperparameter: Value
Discount factor: 0.99
Image size: 84 × 84 (TD-MPC); 224 × 224 (MoDem)
Frame stack: 3 (TD-MPC); 2 (MoDem)
Action repeat: 1 (xArm); 2 (Adroit, Finger, and Walker in DMControl); 4 (otherwise)
Data augmentation: 4-pixel image shifts (TD-MPC); 10-pixel image shifts (MoDem)
Seed steps: 5000
Replay buffer size: Unlimited
Sampling technique: PER (α = 0.6, β = 0.4)
Planning horizon: 5
Latent dimension: 50
Learning rate: 1e-3 (TD-MPC); 3e-4 (MoDem)
Optimizer (θ): Adam (β

The camera position varies continuously. For xArm, the mean of the distribution is 0, the standard deviation is 0.4, and we constrain the noise. For DMControl, we modify the FOV from 45 to 53.
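The PER entry (α = 0.6, β = 0.4) refers to standard prioritized experience replay. As a hedged sketch of what those two hyperparameters control (the function names and the small priority offset are illustrative assumptions):

```python
import numpy as np

def per_probabilities(td_errors, alpha=0.6):
    """Sampling probability of transition i is p_i^alpha / sum_j p_j^alpha,
    with priority p_i = |TD error_i| + epsilon; alpha = 0 recovers
    uniform sampling."""
    priorities = (np.abs(td_errors) + 1e-6) ** alpha
    return priorities / priorities.sum()

def importance_weights(probs, beta=0.4):
    """Importance-sampling correction w_i = (N * P(i))^(-beta),
    normalized by the maximum weight for stability."""
    w = (len(probs) * probs) ** (-beta)
    return w / w.max()

td = np.array([0.5, 0.1, 2.0, 0.0])
probs = per_probabilities(td)        # largest |TD error| sampled most often
weights = importance_weights(probs)  # and down-weighted the most in the loss
```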


A LoD of Gaussians: Unified Training and Rendering for Ultra-Large Scale Reconstruction with External Memory

Windisch, Felix, Köhler, Thomas, Radl, Lukas, Steiner, Michael, Schmalstieg, Dieter, Steinberger, Markus

arXiv.org Artificial Intelligence

Gaussian Splatting has emerged as a high-performance technique for novel view synthesis, enabling real-time rendering and high-quality reconstruction of small scenes. However, scaling to larger environments has so far relied on partitioning the scene into chunks -- a strategy that introduces artifacts at chunk boundaries, complicates training across varying scales, and is poorly suited to unstructured scenarios such as city-scale flyovers combined with street-level views. Moreover, rendering remains fundamentally limited by GPU memory, as all visible chunks must reside in VRAM simultaneously. We introduce A LoD of Gaussians, a framework for training and rendering ultra-large-scale Gaussian scenes on a single consumer-grade GPU -- without partitioning. Our method stores the full scene out-of-core (e.g., in CPU memory) and trains a Level-of-Detail (LoD) representation directly, dynamically streaming only the relevant Gaussians. A hybrid data structure combining Gaussian hierarchies with Sequential Point Trees enables efficient, view-dependent LoD selection, while a lightweight caching and view scheduling system exploits temporal coherence to support real-time streaming and rendering. Together, these innovations enable seamless multi-scale reconstruction and interactive visualization of complex scenes -- from broad aerial views to fine-grained ground-level details.
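A rough way to picture the view-dependent LoD selection: for each view, choose the coarsest hierarchy level whose node footprint still projects below a pixel threshold, so distant flyover views stream far fewer Gaussians than close-up street-level views. All constants below (focal length, leaf size, threshold, level count) are illustrative assumptions, not values from the paper.

```python
def lod_level(distance, focal_px=1000.0, leaf_size=0.01, tau=1.0, max_level=8):
    """Return the coarsest LoD level (higher = coarser) whose node
    footprint projects to at most tau pixels at the given distance.
    Leaf Gaussians have world-space size leaf_size; each level up
    doubles the footprint (a typical hierarchy assumption)."""
    for level in range(max_level, -1, -1):            # coarse to fine
        size = leaf_size * (2 ** level)
        projected_px = size * focal_px / max(distance, 1e-6)
        if projected_px <= tau:
            return level
    return 0  # even the finest level projects above tau: use leaves

# Nearby views select fine levels, distant views coarse ones.
near, far = lod_level(1.0), lod_level(100.0)
```

A full system would additionally cache recently streamed levels and schedule views to exploit temporal coherence, as the abstract describes; this sketch covers only the per-view level choice.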


